359 research outputs found

    Backup without redundancy: genetic interactions reveal the cost of duplicate gene loss.

    Many genes can be deleted with little phenotypic consequence. By what mechanism, and to what extent, the presence of duplicate genes in the genome contributes to this robustness against deletions has been the subject of considerable interest. Here, we exploit the availability of high-density genetic interaction maps to provide direct support for the role of backup compensation, where functionally overlapping duplicates cover for the loss of their paralog. However, we find that the overall contribution of duplicates to robustness against null mutations is low (approximately 25%). The ability to directly identify buffering paralogs allowed us to further study their properties and how they differ from non-buffering duplicates. Using environmental sensitivity profiles as well as quantitative genetic interaction spectra as high-resolution phenotypes, we establish that even duplicate pairs with compensation capacity exhibit rich and typically non-overlapping deletion phenotypes, and are thus unable to comprehensively cover against loss of their paralog. Our findings reconcile the fact that duplicates can compensate for each other's loss under a limited number of conditions with the evolutionary instability of genes whose loss is not associated with a phenotypic penalty.

    Comparative gene expression analysis by differential clustering approach: application to the Candida albicans transcription program.

    Differences in gene expression underlie many of the phenotypic variations between related organisms, yet approaches to characterize such differences on a genome-wide scale are not well developed. Here, we introduce the "differential clustering algorithm" for revealing conserved and diverged co-expression patterns. Our approach is applied at different levels of organization, ranging from pair-wise correlations within specific groups of functionally linked genes to higher-order correlations between such groups. Using the differential clustering algorithm, we systematically compared the transcription program of the fungal pathogen Candida albicans with that of the model organism Saccharomyces cerevisiae. Many of the identified differences are related to the differential requirement for mitochondrial function in the two yeasts. Distinct regulation patterns of cell cycle genes and of amino acid metabolic genes were also revealed and, in some cases, could be linked to the differential appearance of cis-regulatory elements in the gene promoter regions. Our study provides a comprehensive framework for comparative gene expression analysis and a rich source of hypotheses for uncharacterized open reading frames and putative cis-regulatory elements in C. albicans.

    Development and preliminary validation of a Family Nutrition and Physical Activity (FNPA) screening tool

    Background: Parents directly influence children's physical activity and nutrition behaviors and also dictate the physical and social environments that are available to their children. This paper summarizes the development of an easy-to-use screening tool, the Family Nutrition and Physical Activity (FNPA) Screening Tool, designed to assess family environmental and behavioral factors that may predispose a child to becoming overweight. Methods: The FNPA instrument was developed using constructs identified in a comprehensive evidence analysis conducted in collaboration with the American Dietetic Association. Two or three items were created for each of the ten constructs with evidence grades of II or higher. Parents of first grade students from a large urban school district (39 schools) were recruited to complete the FNPA screening tool and provide permission to link results to BMI data obtained from trained nurses in each school. A total of 1085 surveys were completed out of the available sample of 2189 children in the district. Factor analysis was conducted to examine the factor structure of the scale. Mixed model analyses were conducted on the composite FNPA score to determine whether patterns in home environments and behaviors matched some of the expected socio-economic status (SES) and ethnic patterns in BMI. Correlations among FNPA constructs and other main variables were computed to examine possible associations among the various factors. Finally, logistic regression was used to evaluate the construct validity of the FNPA scale. Results: Factor analyses revealed the presence of a single factor, and this unidimensional structure was supported by the correlation analyses. The correlations among constructs were consistently positive, but the total score had higher correlations with child BMI than the individual constructs. The FNPA scores followed expected demographic patterns, with low income families reporting lower (less favorable) scores than moderate or high income families. Children with a total score in the lowest tertile (high risk family environment and behaviors) had an odds ratio (OR) of 1.7 (95% CI = 1.07–2.80) compared to children with a total score in the highest tertile (more favorable family environment and behaviors), but this effect was reduced when parent BMI was included as a covariate. Conclusion: The results support the contention that the FNPA tool captures important elements of the family environment and behaviors that relate to risk for child overweight.

    Subgraphs and network motifs in geometric networks

    Many real-world networks describe systems in which interactions decay with the distance between nodes. Examples include systems constrained in real space such as transportation and communication networks, as well as systems constrained in abstract spaces such as multivariate biological or economic datasets and models of social networks. These networks often display network motifs: subgraphs that recur in the network much more often than in randomized networks. To understand the origin of the network motifs in these networks, it is important to study the subgraphs and network motifs that arise solely from geometric constraints. To address this, we analyze geometric network models, in which nodes are arranged on a lattice and edges are formed with a probability that decays with the distance between nodes. We present analytical solutions for the numbers of all 3- and 4-node subgraphs, in both directed and non-directed geometric networks. We also analyze geometric networks with arbitrary degree sequences, and models with a field that biases for directed edges in one direction. Rules for the scaling of subgraph numbers with system size, lattice dimension and interaction range are given. Several invariant measures are found, such as the ratio of feedback and feed-forward loops, which do not depend on system size, dimension or connectivity function. We find that network motifs in many real-world networks, including social networks and neuronal networks, are not captured solely by these geometric models. This is in line with recent evidence that biological network motifs were selected as basic circuit elements with defined information-processing functions. Comment: 9 pages, 6 figures.
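
The geometric model in the abstract above can be explored numerically. What follows is a minimal sketch in Python, not the authors' code. It assumes a one-dimensional ring lattice and an exponential connectivity function exp(-d / decay_length); both are illustrative choices, since the abstract does not fix them. The function names (geometric_digraph, count_loops) are hypothetical. Subgraphs are counted as edge patterns (non-induced), which is also an assumption.

```python
import math
import random
from itertools import permutations


def geometric_digraph(n, decay_length, rng):
    """Sample directed edges on a ring of n nodes.

    Each ordered pair (a, b) gets an edge with probability
    exp(-d / decay_length), where d is the ring distance (assumed form).
    """
    edges = set()
    for a in range(n):
        for b in range(n):
            if a != b:
                d = min(abs(a - b), n - abs(a - b))  # distance on the ring
                if rng.random() < math.exp(-d / decay_length):
                    edges.add((a, b))
    return edges


def count_loops(edges):
    """Return (feed-forward loops, feedback loops) as edge-pattern counts."""
    nodes = sorted({v for edge in edges for v in edge})
    ffl = cycles = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edges and (b, c) in edges:
            if (a, c) in edges:  # a->b, b->c, a->c: feed-forward loop
                ffl += 1
            if (c, a) in edges:  # a->b, b->c, c->a: feedback loop
                cycles += 1
    # Every 3-cycle is seen once per rotation of (a, b, c).
    return ffl, cycles // 3


# Sample one network and report both counts.
sample = geometric_digraph(30, 2.0, random.Random(0))
print(count_loops(sample))
```

Repeating the sampling for different values of n and decay_length gives an empirical feel for how the two counts scale relative to each other, which is the kind of ratio the abstract reports as invariant.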

    The Iterative Signature Algorithm for the analysis of large scale gene expression data

    We present a new approach for the analysis of genome-wide expression data. Our method is designed to overcome the limitations of traditional techniques when applied to large-scale data. Rather than allotting each gene to a single cluster, we assign both genes and conditions to context-dependent and potentially overlapping transcription modules. We provide a rigorous definition of a transcription module as the object to be retrieved from the expression data. We establish an efficient algorithm that searches for the modules encoded in the data by iteratively refining sets of genes and conditions until they match this definition. Each iteration involves a linear map, induced by the normalized expression matrix, followed by the application of a threshold function. We argue that our method is in fact a generalization of Singular Value Decomposition, which corresponds to the special case where no threshold is applied. We show analytically that for noisy expression data our approach leads to better classification due to the implementation of the threshold. This result is confirmed by numerical analyses based on in-silico expression data. We briefly discuss results obtained by applying our algorithm to expression data from the yeast S. cerevisiae. Comment: Latex, 36 pages, 8 figures.
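
The iteration described in the abstract above (a linear map through the normalized expression matrix, then a threshold) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The normalization (z-scores per gene and per condition), the threshold rule (keep scores more than t standard deviations above the mean), the stopping test (gene set unchanged) and all names (zscore, threshold, isa, toy) are assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore(values):
    """Centre to zero mean and scale to unit variance (all zeros if constant)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s if s > 0 else 0.0 for v in values]


def threshold(scores, t):
    """Keep entries scoring more than t standard deviations above the mean."""
    z = zscore(scores)
    return {i: scores[i] for i in range(len(scores)) if z[i] > t}


def isa(expr, seed_genes, t_gene=1.0, t_cond=1.0, max_iter=50):
    """Refine a gene set and a condition set until the gene set stops changing."""
    n_genes, n_conds = len(expr), len(expr[0])
    # Two normalized views of the matrix (assumed form of the normalization).
    by_gene = [zscore(row) for row in expr]  # indexed [gene][condition]
    by_cond = [zscore([expr[g][c] for g in range(n_genes)])
               for c in range(n_conds)]      # indexed [condition][gene]
    genes = {g: 1.0 for g in seed_genes}
    conds = {}
    for _ in range(max_iter):
        # Linear map from gene weights to condition scores, then threshold.
        cond_scores = [sum(by_cond[c][g] * w for g, w in genes.items())
                       for c in range(n_conds)]
        conds = threshold(cond_scores, t_cond)
        # Linear map from condition weights to gene scores, then threshold.
        gene_scores = [sum(by_gene[g][c] * w for c, w in conds.items())
                       for g in range(n_genes)]
        new_genes = threshold(gene_scores, t_gene)
        if set(new_genes) == set(genes):
            break
        genes = new_genes
    return set(genes), set(conds)


# Toy matrix: genes 0-2 are elevated in conditions 0-1, everything else is flat.
toy = [[5, 5, 0, 0, 0] for _ in range(3)] + [[0] * 5 for _ in range(5)]
module_genes, module_conds = isa(toy, seed_genes={0})
print(sorted(module_genes), sorted(module_conds))
```

Starting from the single seed gene 0, the toy run settles on the planted block of three genes and two conditions. Real data would need noise-tolerant thresholds and many random seeds, which this sketch does not attempt.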

    An optimization model for metabolic pathways

    Copyright © The Author 2009. Motivation: Different mathematical methods have emerged in the post-genomic era to determine metabolic pathways. These methods can be divided into stoichiometric methods and path finding methods. In this paper we detail a novel optimization model, based upon integer linear programming, to determine metabolic pathways. Our model links reaction stoichiometry with path finding in a single approach. We test the ability of our model to determine 40 annotated Escherichia coli metabolic pathways. We show that our model is able to determine 36 of these 40 pathways in a computationally effective manner. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online (http://bioinformatics.oxfordjournals.org/cgi/content/full/btp441/DC1).

    The NBD-NBD interface is not the sole determinant for transport in ABC transporters

    One of the most exciting scientific challenges in functional genomics concerns the discovery of biologically relevant patterns from gene expression data. For instance, it is extremely useful to provide putative synexpression groups or transcription modules to molecular biologists. We propose a methodology that has proved useful in real cases. It is described as a prototypical KDD scenario that starts from raw expression data selection and proceeds until useful patterns are delivered. Our conceptual contribution is (a) to emphasize how to make the most of recent progress in constraint-based mining of set patterns, and (b) to propose a generic approach for gene expression data enrichment. The methodology has been validated on real data sets.

    Discovering transcriptional modules by Bayesian data integration

    Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent, and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.

    Characterization and Comparison of the Tissue-Related Modules in Human and Mouse

    BACKGROUND: Due to advances in high-throughput technology and data-collection approaches, we are now in an unprecedented position to understand the evolution of organisms. Great efforts have characterized many individual genes responsible for interspecies divergence, yet little is known about genome-wide divergence at a higher level. Modules, serving as the building blocks and operational units of biological systems, provide more information than individual genes. Hence, comparative analysis between species at the module level would shed more light on the mechanisms underlying the evolution of organisms than traditional comparative genomics approaches. RESULTS: We systematically identified the tissue-related modules using the iterative signature algorithm (ISA), and we detected 52 and 65 modules in the human and mouse genomes, respectively. The gene expression patterns indicate that all of these predicted modules have a high possibility of serving as real biological modules. In addition, we defined a novel quantity, "total constraint intensity," a proxy of multiple constraints (of co-regulated genes and tissues where the co-regulation occurs) on the evolution of genes in module context. We demonstrate that the evolutionary rate of a gene is negatively correlated with its total constraint intensity. Furthermore, there are modules coding the same essential biological processes whose gene contents have diverged extensively between human and mouse. CONCLUSIONS: Our results suggest that, unlike the composition of modules, which differs greatly between human and mouse, the functional organization of the corresponding modules may evolve in a more conservative manner. Most importantly, our findings imply that similar biological processes can be carried out by different sets of genes in human and mouse; therefore, the functional data of individual genes from mouse may not apply to human on certain occasions.